
[DSL] Refactor effective data retention #105750

Merged: 7 commits into elastic:main, Mar 5, 2024

Conversation

@gmarouli (Contributor) commented Feb 22, 2024

In this PR we refactor DataStreamLifecycle to prepare for data stream lifecycle global settings. We introduce two methods that retrieve the data retention. Currently they have the exact same implementation, but in the future they will differ.

  • getDataStreamRetention: the retention a user configured on the data stream itself, either at creation time or via the PUT _data_stream/{data-stream}/_lifecycle API.
  • getEffectiveDataRetention: the retention that DSL will actually apply to the data stream. Right now it simply calls getDataStreamRetention, but in the future it will also take the global retention into consideration.

The benefit of this work is that it lets us semantically distinguish the two retentions in a manageable PR.
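
As a rough sketch of the split described above (not the PR's actual code; the class skeleton and the dataRetention field name are assumptions), the two accessors on DataStreamLifecycle might look like this:

```java
// Minimal sketch, assuming a single user-configured retention value on DataStreamLifecycle.
public TimeValue getDataStreamRetention() {
    // retention the user configured at data stream creation time or via
    // PUT _data_stream/{data-stream}/_lifecycle
    return dataRetention;
}

public TimeValue getEffectiveDataRetention() {
    // currently identical to the user-configured retention; once global
    // retention settings exist, this is where they would be factored in
    return getDataStreamRetention();
}
```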

@gmarouli added labels: >refactoring, :Data Management/Data streams (Data streams and their lifecycles) on Feb 22, 2024
@gmarouli requested a review from masseyke on Feb 22, 2024 16:42
@elasticsearchmachine added labels: v8.14.0, Team:Data Management (Meta label for data/management team) on Feb 22, 2024
@elasticsearchmachine (Collaborator) commented:

Pinging @elastic/es-data-management (Team:Data Management)

@gmarouli (Contributor, Author) commented Mar 4, 2024

@elasticmachine update branch

A reviewer (Member) left an inline question on the following snippet from the PR:

IndexMetadata.DownsampleTaskStatus downsampleStatus = INDEX_DOWNSAMPLE_STATUS.get(backingIndex.getSettings());
// we don't want to delete the source index if it has an in-progress downsampling operation because the
// target downsample index will remain in the system as a standalone index
TimeValue effectiveDataRetention = dataStream.getLifecycle().getEffectiveDataRetention();
Just curious if there's a reason you moved this inside the for loop -- my understanding is that it's going to be the same for every index in the datastream, right? I assume this will always be a fairly cheap calculation, but any reason to do it repeatedly?
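
A hedged sketch of the alternative this question points at: the effective retention is a property of the data stream rather than of any single backing index, so it could be computed once before iterating. The backingIndices collection name and the loop shape below are assumptions, not code from the PR.

```java
// Sketch only: hoist the per-data-stream lookup out of the per-index loop.
TimeValue effectiveDataRetention = dataStream.getLifecycle().getEffectiveDataRetention();
for (IndexMetadata backingIndex : backingIndices) {
    IndexMetadata.DownsampleTaskStatus downsampleStatus = INDEX_DOWNSAMPLE_STATUS.get(backingIndex.getSettings());
    // per-index downsampling and retention handling continues here as before
}
```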

@masseyke (Member) left a review comment:

LGTM. I had one minor question I put in the code. I also notice that there are no tests for getEffectiveDataRetention, but I assume that those will come once getEffectiveDataRetention actually does something different from getDataStreamRetention.

@gmarouli (Contributor, Author) commented Mar 5, 2024

@elasticmachine branch update

@gmarouli (Contributor, Author) commented Mar 5, 2024

Existing test failure: #105922

gmarouli merged commit d65461d into elastic:main on Mar 5, 2024 (14 checks passed).
gmarouli deleted the refactoring-for-effective-retention branch on March 5, 2024 at 10:44.